Discourse knowledge in device independent document formatting
Abstract
Most document structures define layout structures that implicitly encode semantic relationships between content elements. While document structures for text are well established (books, reports, papers, etc.), models for time-based documents such as multimedia and hypermedia are relatively new and lack established document structures. Traditional document description languages convey domain-dependent semantic relationships implicitly, using domain-independent mark-up to express layout. This works well for textual documents, as CSS and HTML, for example, demonstrate. True device independence, however, sometimes requires a change of document model to maintain the content semantics. To achieve this, we need explicit information about the discourse role of each content element. We propose a model in which content is marked up with the discourse role it plays in the document. The formatter then knows the function of each content element and can make appropriate layout choices.
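The idea of role-annotated content can be illustrated with a minimal sketch. It is not the authors' model: the role labels ("claim", "example"), the device names ("screen", "audio"), and the `ContentElement` and `format_for` names are all hypothetical, chosen only to show how a formatter that knows an element's discourse role might pick a different presentation per device.

```python
from dataclasses import dataclass


@dataclass
class ContentElement:
    role: str      # discourse role; the labels used here are hypothetical
    content: str


def format_for(elements, device):
    """Render each element using its discourse role and the target device."""
    rendered = []
    for element in elements:
        is_example = element.role == "example"
        if device == "screen":
            # Spatial layout: indent an example under the claim it supports.
            prefix = "    " if is_example else ""
        elif device == "audio":
            # Temporal layout: indentation is unavailable, so signal the role verbally.
            prefix = "For example: " if is_example else ""
        else:
            raise ValueError(f"unknown device: {device}")
        rendered.append(prefix + element.content)
    return rendered


doc = [
    ContentElement("claim", "Layout implies semantics."),
    ContentElement("example", "Indentation marks a supporting point."),
]
for line in format_for(doc, "screen"):
    print(line)
```

The same relationship (an example supporting a claim) is kept on both devices, but it is expressed through indentation on screen and through a spoken cue in audio, which is the kind of model change the abstract describes.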
Similar resources
Models and Languages for Formatted Documents
The largest change that has come to the world of document formatting since TeX’s DVI language was designed is the need to support documents destined for multiple uses, e.g., for interactive reading on screen and for paper output. It is time to investigate what is needed, both now and in the immediate future, from a device-independent description language for formatted documents. This paper does...
Merging Logical and Physical Structures in Documents
Although it is well established that structured documents and generic models bring benefits to applications involving documents, integrating these document models in the formatting process of interactive editors is still an open problem. In this paper, the problem of laying out and formatting structured documents is investigated, taking into account the DSSSL standard. One key point of this mod...
A Bayesian Network Approach to Semantic Labelling of Text Formatting in XML Corpora of Documents
The widespread applications of document digitization have led to the use of structured digital representation methods such as the XML language. Extraction methodologies for the formatting metadata can be used on such structured documents for enhancing their accessibility, including augmented audio representation of documents. To the best of our knowledge, an effort has yet to be made to produ...
Methodology for Validation of Issuance of Mystical and Ethical Narrations (A Case Study and Discourse Analysis on the Methodology of the Book Sirr ul-asra’)
The Book “the Secret of Prophet Mohammad’s Midnight Journey to the Seven Heavens in Explanation of Al-Mi’raj Hadith” is written by Ayatollah Sa’adatparvar. Analyzing the discourse of a part of its introduction, his recognition method about this hadith has been investigated in this paper. The paper aims at investigating the particular discourse pattern of the author in analyzing the document of ...
The Representation of Social Actors in the Graduate Employability Issue: Online News and the Government Document
This paper presents the first part of a larger study on the issue of graduate employability in Malaysia as construed in public discourse in English, a language of power in Malaysia. The term employability itself has many definitions depending on the requirements of government and industry, and in the case of Malaysia, the English-language ability of graduates is inseparable from graduate employ...